ABSTRACT:

PROBLEM TO BE SOLVED: To improve recognition precision even for a character of low quality by providing an extension dictionary that stores information based on the character pattern of a character which cannot be correctly recognized by using a standard dictionary.

SOLUTION: In this character recognition device, a standard dictionary 5 stores in advance, for every character, standard information for recognition of a specified character. A picture input part 1 inputs a picture, and a preprocessing part 2 cuts out a character from the inputted picture. A recognition processing part 3 compares and collates information based on the character pattern of an unknown character with the standard information of every character stored in the standard dictionary 5 in advance, and the character whose degree of similarity is the largest is outputted as a recognition result. A control part 4 controls the whole device. The character recognition device is also provided with an extension dictionary 6 in addition to the standard dictionary 5, and the extension dictionary 6 stores information based on the character pattern of a character which cannot be correctly recognized by using the standard dictionary 5. In particular, only information based on the character pattern of Kanji (Chinese characters) is stored.

* NOTICES *

JPO and NCIPI are not responsible for any 
damages caused by the use of this translation. 

1. This document has been translated by computer. So the translation may not reflect the original 
precisely. 

2. **** shows the word which can not be translated. 
3. In the drawings, any words are not translated. 



DETAILED DESCRIPTION 



[Detailed Description of the Invention] 
[0001] 

[Field of the Invention] This invention relates to a character reader, a dictionary creation method, and a recording medium.

[0002] 

[Description of the Prior Art] Conventionally, for example, JP,4-242494,A shows a technique for preventing a fall in recognition precision. A 1st dictionary containing characters with a high frequency of occurrence and a 2nd dictionary containing characters with a low frequency of occurrence are provided as dictionaries for recognition; the feature quantity of an unknown character pattern is first collated with the features stored in the 1st dictionary, and the 2nd dictionary is then collated according to the judgment made about that recognition result.

[0003] Moreover, conventionally, when a character is cut out from an input image, recognition processing is performed using a standard dictionary, and it is judged as a result that the character cut-out location is wrong, the character is cut out anew and recognition processing is performed again using the standard dictionary. That is, the causes of incorrect recognition can be divided roughly into two: the character was not cut out correctly, or the character was cut out correctly but does not match. The conventional method mentioned above assumes the case where the character has not been cut out correctly. Recognition processing is first performed with the standard dictionary; a character that is judged from its similarity and character size not to have been cut out correctly is cut out again; recognition processing is performed with the new character rectangle coordinates, again using the standard dictionary; and the 1st result is compared with the 2nd result.
[0004] 

[Problem(s) to be Solved by the Invention] However, in the technique shown in JP,4-242494,A mentioned above, although two dictionaries are provided, the two dictionaries are divided by frequency of occurrence, so the recognition precision for low-quality characters could not be raised.

[0005] Moreover, in the conventional method in which the character is cut out anew and recognized again using a standard dictionary when the character cut-out location is judged to be wrong, recognition processing is performed using only the standard dictionary, so the recognition precision for low-quality characters could not be raised. That is, for a character of inferior quality, there was a problem that the character recognition location shifted and recognition processing could not be performed correctly.

[0006] In particular, this invention aims to provide a character reader, a dictionary creation method, and a recording medium capable of raising recognition precision even for low-quality characters.

[0007] 
[Means for Solving the Problem] In order to attain the above-mentioned purpose, the invention according to claim 1 is a character reader having: a standard dictionary in which information based on the character pattern of a predetermined character is stored in advance for every character; a character segmentation means which cuts out a character from an input image; and a recognition processing means which compares and collates information based on the character pattern of an unknown character with the information for every character stored in advance in the standard dictionary, and outputs the character with the largest similarity as a recognition result. The character reader is characterized in that an extended dictionary is further provided in addition to said standard dictionary, and that, for a character which could not be correctly recognized using the standard dictionary, the extended dictionary stores information based on the character pattern of that character.

[0008] Moreover, the invention according to claim 2 is a character reader having: a standard dictionary in which a feature quantity extracted from the character pattern of a predetermined character is stored in advance for every character; a character segmentation means which cuts out a character from an input image; and a recognition processing means which extracts a feature quantity from the character pattern of an unknown character, compares and collates this feature quantity with the feature quantity for every character stored in advance in the standard dictionary, and outputs the character with the largest similarity as a recognition result. The character reader is characterized in that an extended dictionary is further provided in addition to said standard dictionary, and that, for a character which could not be correctly recognized using the standard dictionary, the extended dictionary stores the feature quantity extracted from the character pattern of that character.

[0009] Moreover, the invention according to claim 3 is a character reader having: a standard dictionary in which the character pattern of a predetermined character is stored in advance for every character; a character segmentation means which cuts out a character from an input image; and a recognition processing means which compares and collates the character pattern of an unknown character with the character pattern for every character stored in advance in the standard dictionary, and outputs the character with the largest similarity as a recognition result. The character reader is characterized in that an extended dictionary is further provided in addition to said standard dictionary, and that, for a character which could not be correctly recognized using the standard dictionary, the extended dictionary stores the character pattern of that character.

[0010] Moreover, the invention according to claim 4 is characterized in that, in the character reader according to any one of claims 1 to 3, the standard dictionary stores information based on the character patterns of all the characters subject to recognition.

[0011] Moreover, the invention according to claim 5 is characterized in that, in the character reader according to any one of claims 1 to 4, said standard dictionary is created only from character patterns of good character quality.

[0012] Moreover, the invention according to claim 6 is characterized in that, in the character reader according to any one of claims 1 to 3, said extended dictionary stores only information based on the character patterns of some of the characters subject to recognition.

[0013] Moreover, the invention according to claim 7 is characterized in that, in the character reader according to claim 6, said extended dictionary stores only information based on the character patterns of kanji.

[0014] Moreover, the invention according to claim 8 is characterized in that, in the character reader according to any one of claims 1 to 7, recognition processing is first performed on the character pattern of an unknown character using the standard dictionary, and when the resulting similarity is smaller than a predetermined threshold, recognition processing is further performed using the extended dictionary.

[0015] Moreover, the invention according to claim 9 is characterized in that, in the character reader according to any one of claims 1 to 7, recognition processing is first performed on the character pattern of an unknown character using the standard dictionary, and when it is judged as a result that the cut-out location of this character in the input image is not right, recognition processing is performed using the extended dictionary.

[0016] Moreover, the invention according to claim 10 is characterized in that, in the character reader according to any one of claims 1 to 7, recognition processing is first performed on the character pattern of an unknown character using the standard dictionary, and when it is judged as a result that the cut-out location of this character in the input image is not right, the character is cut out at a new cut-out location, and recognition processing is performed on the character pattern cut out at the new location using both the standard dictionary and the extended dictionary.

[0017] Moreover, the invention according to claim 11 is characterized in that, in the character reader according to any one of claims 1 to 7, recognition processing is performed on the character pattern of an unknown character using the standard dictionary and the extended dictionary, respectively; the similarity of the recognition result obtained using the standard dictionary is compared with the similarity of the recognition result obtained using the extended dictionary; and the result with the larger similarity is output as the final recognition result.
[0018] Moreover, the invention according to claim 12 is a dictionary creation method for a character reader having: a standard dictionary in which information based on the character pattern of a predetermined character is stored in advance for every character; a character segmentation means which cuts out a character from an input image; and a recognition processing means which compares and collates information based on the character pattern of an unknown character with the information for every character stored in advance in the standard dictionary, and outputs the character with the largest similarity as a recognition result. The method is characterized in that the binarization threshold of a multi-value image is changed sequentially from a value at which the binary image is crushed to a value at which it becomes blurred, the binary images obtained while changing the binarization threshold are recognized using the standard dictionary, and an extended dictionary is created using the binary character pattern at the threshold at which the image was first recognized using the standard dictionary and the binary character pattern at the threshold at which it was last recognized.

[0019] Moreover, the invention according to claim 13 is a dictionary creation method for a character reader having: a standard dictionary in which information based on the character pattern of a predetermined character is stored in advance for every character; a character segmentation means which cuts out a character from an input image; and a recognition processing means which compares and collates information based on the character pattern of an unknown character with the information for every character stored in advance in the standard dictionary, and outputs the character with the largest similarity as a recognition result. The method is characterized in that the binarization threshold of a multi-value image is changed sequentially from a value at which the binary image is crushed to a value at which it becomes blurred, the binary images obtained while changing the binarization threshold are recognized using the standard dictionary, and an extended dictionary is created using the binary character pattern at the threshold at which the image was first recognized using the standard dictionary.

[0020] Moreover, the invention according to claim 14 is a dictionary creation method for a character reader having: a standard dictionary in which information based on the character pattern of a predetermined character is stored in advance for every character; a character segmentation means which cuts out a character from an input image; and a recognition processing means which compares and collates information based on the character pattern of an unknown character with the information for every character stored in advance in the standard dictionary, and outputs the character with the largest similarity as a recognition result. The method is characterized in that the binarization threshold of a multi-value image is changed sequentially from a value at which the binary image is crushed to a value at which it becomes blurred, the binary images obtained while changing the binarization threshold are recognized using the standard dictionary, and an extended dictionary is created using the binary character pattern at the threshold at which the image was last recognized.

[0021] Moreover, the invention according to claim 15 is characterized in that a program for causing a computer to implement the character reader according to any one of claims 1 to 11 is recorded on a recording medium readable by that computer.

[0022] Moreover, the invention according to claim 16 is characterized in that a program for causing a computer to execute the dictionary creation method according to any one of claims 12 to 14 is recorded on a recording medium which said computer can read.

[0023] Moreover, the invention according to claim 17 is characterized in that the character patterns of predetermined characters are stored on a server machine connected to a network, and a standard dictionary and/or an extended dictionary is created on a separate client machine based on the character patterns stored on said server machine.
[0024] 

[Embodiment of the Invention] Hereafter, an embodiment of this invention is explained based on the drawings. Drawing 1 is a block diagram of the character reader according to this invention. Referring to drawing 1, this character reader has: the standard dictionary 5, in which standard information for recognition processing of a predetermined character is stored in advance for every character; the image input section 1 (for example, a scanner), which inputs an image; the pretreatment section 2, which cuts out a character from the input image; the recognition processing section 3, which compares and collates information based on the character pattern of an unknown character with the standard information for every character stored in advance in the standard dictionary 5 and outputs the character with the largest similarity as a recognition result; and the control section 4, which controls the whole. In this character reader, the extended dictionary 6 is further provided in addition to the standard dictionary 5, and for a character which could not be correctly recognized using the standard dictionary 5, information based on the character pattern of that character is stored in the extended dictionary 6.

[0025] In more detail, the standard dictionary 5 stores information based on the character patterns of all the characters subject to recognition, and the standard dictionary 5 is created only from character patterns of good character quality. The extended dictionary 6, by contrast, stores only information based on the character patterns of some of the characters subject to recognition. In particular, only information based on the character patterns of kanji is stored.

[0026] In addition, the information based on the character pattern stored in the standard dictionary 5 and the extended dictionary 6 may be either the character pattern itself or a feature quantity extracted from the character pattern.

[0027] When the information stored in the standard dictionary 5 and the extended dictionary 6 is a feature quantity extracted from the character pattern, the recognition processing section 3 extracts a feature quantity from the character pattern of an unknown character, and compares and collates this feature quantity with the feature quantity for every character stored in advance in the standard dictionary 5 and the extended dictionary 6, respectively. When the information stored in the standard dictionary 5 and the extended dictionary 6 is the character pattern itself, the recognition processing section 3 compares and collates the character pattern of the unknown character with the character pattern for every character stored in advance in the standard dictionary 5 and the extended dictionary 6, respectively. This invention is applicable both to the case where the stored information is the character pattern itself and to the case where it is a feature quantity extracted from the character pattern.
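
The two cases above can be illustrated with a minimal sketch. It is not taken from this publication: the pixel-agreement similarity, the per-row black-pixel share used as a feature quantity, and all function names are hypothetical choices made only to show that the same collation step works whether a dictionary holds patterns or feature quantities.

```python
from typing import Callable, Dict, Sequence, Tuple

Pattern = Tuple[Tuple[int, ...], ...]  # binary character image: 1 = black, 0 = white


def pattern_similarity(a: Pattern, b: Pattern) -> float:
    """Fraction of pixels that agree between two same-sized binary patterns."""
    total = sum(len(row) for row in a)
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def row_density(p: Pattern) -> Tuple[float, ...]:
    """Hypothetical feature quantity: share of black pixels in each row."""
    return tuple(sum(row) / len(row) for row in p)


def feature_similarity(fa: Sequence[float], fb: Sequence[float]) -> float:
    """1 minus the mean absolute difference between two feature vectors."""
    return 1.0 - sum(abs(x - y) for x, y in zip(fa, fb)) / len(fa)


def best_match(unknown, dictionary: Dict[str, object],
               similarity: Callable) -> Tuple[str, float]:
    """Collate `unknown` with every entry and return the most similar one."""
    scored = ((ch, similarity(unknown, ref)) for ch, ref in dictionary.items())
    return max(scored, key=lambda item: item[1])


# Case 1: the dictionary stores the character patterns themselves.
patterns = {"A": ((1, 0), (0, 1)), "B": ((1, 1), (1, 1))}
unknown = ((1, 0), (0, 1))
by_pattern = best_match(unknown, patterns, pattern_similarity)

# Case 2: the dictionary stores feature quantities extracted from the patterns.
features = {ch: row_density(p) for ch, p in patterns.items()}
by_feature = best_match(row_density(unknown), features, feature_similarity)
```

Only the stored representation and the similarity function differ between the two cases; the step of selecting the entry with the largest similarity is shared.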

[0028] Moreover, the control section 4 controls the image input section 1, the pretreatment section 2, and the recognition processing section 3, and performs switching control between the standard dictionary 5 and the extended dictionary 6.

[0029] Next, the processing operation of the character reader configured as above is explained. In this character reader, a manuscript, document, or the like on which recognition processing is to be performed is first read by the image input section 1. Once the input image has been read, the pretreatment section 2 performs binarization processing when the read manuscript is a color or multi-value image manuscript, and also performs field discrimination processing in which tables, drawings, etc. are distinguished from text, line cut-out processing which cuts out lines from a text field, and character cut-out processing which cuts out characters. In this way, the pretreatment section 2 finally cuts out and outputs the characters (character images) subject to recognition from the input image. As this shows, the pretreatment section 2 includes the function of a character segmentation part.

[0030] Next, in the recognition processing section 3, the information based on the character pattern of the character cut out in the pretreatment section 2 is collated with the standard dictionary 5 and/or the extended dictionary 6.

[0031] Drawing 2 is a flow chart which shows the processing flow of the recognition processing section 3. Referring to drawing 2, when a character is cut out in the pretreatment section 2, the recognition processing section 3 first compares and collates the information based on the character pattern of the cut-out character with the standard dictionary 5 (step S1). In this comparison and collation, the information is collated with the information about all the characters in the standard dictionary 5, and the character which gives the largest similarity is output as the recognition result of the standard dictionary 5.

[0032] When the character which gives the largest similarity (referred to as similarity 1) with the standard dictionary 5 has been obtained as a result of this collation, similarity 1 is compared with a predetermined threshold (step S2). When similarity 1 is larger than the threshold, comparison and collation with the extended dictionary 6 is not performed, and the result of collation with the standard dictionary 5 (namely, the character which gives the largest similarity, similarity 1) is taken as the final recognition result (step S3).

[0033] On the other hand, when similarity 1 is not larger than the threshold in step S2, comparison and collation with the extended dictionary 6 is further performed (step S4). In this comparison and collation, the information is collated with the information about all the characters in the extended dictionary 6, and the character which gives the largest similarity is output as the recognition result of the extended dictionary 6.

[0034] When the character which gives the largest similarity (referred to as similarity 2) with the extended dictionary 6 has been obtained as a result of this collation, similarity 2 is compared with similarity 1 obtained at step S1 (step S5). When similarity 2 is larger than similarity 1, the recognition result of the extended dictionary 6 is taken as the final recognition result (step S6), and when similarity 1 is larger than similarity 2, the recognition result of the standard dictionary 5 is taken as the final recognition result (step S3).
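
The flow of steps S1 to S6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of this publication: character patterns are represented as equal-length bit strings, `similarity` is a toy pixel-agreement measure, and the function names are hypothetical. The text does not say what happens when similarity 1 and similarity 2 are equal; the sketch keeps the standard-dictionary result in that case.

```python
from typing import Dict, Tuple


def similarity(a: str, b: str) -> float:
    """Toy similarity: fraction of matching pixels in two equal-length bit strings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def best_match(unknown: str, dictionary: Dict[str, str]) -> Tuple[str, float]:
    """Collate `unknown` with every entry; return the character with the largest similarity."""
    scored = ((ch, similarity(unknown, ref)) for ch, ref in dictionary.items())
    return max(scored, key=lambda item: item[1])


def recognize(unknown: str, standard: Dict[str, str], extended: Dict[str, str],
              threshold: float) -> Tuple[str, float, str]:
    """Return (character, similarity, name of the dictionary that produced the result)."""
    char1, sim1 = best_match(unknown, standard)   # step S1
    if sim1 > threshold:                          # step S2
        return char1, sim1, "standard"            # step S3: extended dictionary not consulted
    char2, sim2 = best_match(unknown, extended)   # step S4
    if sim2 > sim1:                               # step S5
        return char2, sim2, "extended"            # step S6
    return char1, sim1, "standard"                # step S3 (ties kept here by assumption)
```

For example, with `standard = {"A": "1100", "B": "0011"}`, `extended = {"C": "1110"}` and a threshold of 0.9, the input `"1100"` is settled by the standard dictionary alone, while `"1110"` falls below the threshold and is resolved by the extended dictionary.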

[0035] Here, the extended dictionary 6 is created from the low-quality character images that were not recognized with the standard dictionary 5 when character images of various image quality were subjected to recognition with the standard dictionary 5.

[0036] Therefore, an image (manuscript) of good image quality is collated with the standard dictionary 5, which is created from images of good image quality, and is recognized with a similarity larger than the threshold, so it is rarely collated with the extended dictionary 6, and the recognition processing time can be shortened.

[0037] Moreover, for a low-quality image (manuscript), comparison and collation with the extended dictionary 6 is performed, so highly precise recognition is attained.

[0038] Thus, in this invention, the extended dictionary 6 is provided in addition to the standard dictionary 5, and since the extended dictionary 6 is created from the low-quality character images that were not recognized with the standard dictionary 5 when character images of various image quality were subjected to recognition with the standard dictionary 5, recognition precision can be raised even for low-quality characters. Furthermore, providing such an extended dictionary 6 also makes it possible to cope with recognition processing of a character of inferior quality for which, for example, the character recognition location shifts (application examples).

[0039] That is, as a 1st application example, when recognition processing is first performed on the character pattern of an unknown character using the standard dictionary 5 and it is judged as a result that the cut-out location of this character in the input image is not right, recognition processing can be performed using the extended dictionary 6.

[0040] Moreover, as a 2nd application example, when recognition processing is first performed on the character pattern of an unknown character using the standard dictionary 5 and it is judged as a result that the cut-out location of this character in the input image is not right, the character can be cut out at a new cut-out location, and recognition processing can be performed on the character pattern cut out at the new location using both the standard dictionary 5 and the extended dictionary 6. That is, when a character has been cut out correctly in terms of character size but cannot be matched against the patterns registered in the standard dictionary 5 (that is, similarity is low), it can be recognized correctly by performing recognition processing anew, with the same character cut-out rectangle coordinates as before, using the extended dictionary 6, which is created from patterns not registered in the standard dictionary (for example, crushed patterns and blurred character patterns).
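
The 2nd example of application can be sketched as below. Everything beyond the control flow is an assumption made for illustration: the image is reduced to a one-dimensional bit string, a cut-out rectangle to a `(start, end)` pair, the judgment that the cut-out location is not right is approximated by a low similarity, and `recut` is a hypothetical callback that proposes the new location.

```python
from typing import Callable, Dict, Tuple

Box = Tuple[int, int]  # (start, end) columns of the cut-out rectangle


def similarity(a: str, b: str) -> float:
    """Toy similarity: fraction of matching pixels in two equal-length bit strings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def best_match(pattern: str, dictionary: Dict[str, str]) -> Tuple[str, float]:
    scored = ((ch, similarity(pattern, ref)) for ch, ref in dictionary.items())
    return max(scored, key=lambda item: item[1])


def recognize_with_recut(image: str, box: Box, recut: Callable[[str, Box], Box],
                         standard: Dict[str, str], extended: Dict[str, str],
                         threshold: float) -> Tuple[str, float, Box]:
    """Recognize with the standard dictionary; on a suspect cut-out, re-cut and use both."""
    char, sim = best_match(image[box[0]:box[1]], standard)
    if sim > threshold:            # cut-out location judged acceptable (assumed criterion)
        return char, sim, box
    new_box = recut(image, box)    # cut out the character at a new location
    pattern = image[new_box[0]:new_box[1]]
    candidates = [best_match(pattern, standard), best_match(pattern, extended)]
    char, sim = max(candidates, key=lambda item: item[1])
    return char, sim, new_box
```

Passing a `recut` that returns the original box unchanged reproduces the case described at the end of the paragraph, where the same rectangle coordinates are recognized again with the extended dictionary available.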

[0041] As in the 1st and 2nd application examples, when recognition processing is first performed on the character pattern of an unknown character using the standard dictionary 5 and it is judged as a result that the cut-out location of this character in the input image is not right, performing recognition processing using the extended dictionary 6 (that is, utilizing the extended dictionary 6 rather than only the standard dictionary 5 as before) makes it possible to raise recognition precision even for a character of inferior quality whose character recognition location has shifted, for example.

[0042] In other words, in the character reader of this invention, recognition processing is basically performed first with the standard dictionary 5, and when the reliability of the recognition result is low, recognition processing is performed using the extended dictionary 6. As a result, a manuscript of good quality is matched only against the standard dictionary 5 and need not be matched against the extended dictionary 6, so processing is fast.

[0043] Because it is used in such a character reader, the standard dictionary 5 needs to contain all the character codes of the characters subject to recognition (for example, symbols, hiragana, katakana, kanji, etc.). The extended dictionary 6, on the other hand, targets only patterns whose quality has deteriorated, so by not registering patterns which are comparatively unlikely to degrade (for example, symbols, hiragana, katakana, etc.), the number of matching operations against the extended dictionary 6 is reduced and processing can be sped up. Kanji are an example of characters which degrade easily and therefore need to be registered in the extended dictionary 6. This is because kanji have many strokes and are crushed or blurred more often than hiragana, katakana, etc. Moreover, the judgment may be made with the feature quantity used for recognition; for example, when the contour features of a character are used as the feature quantity for character recognition processing, the fact that a character with a complicated contour tends to degrade may be used. Alternatively, the judgment may be made using a feature which is not used for recognition, for example, the number of black pixels.
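
As a rough illustration of choosing which characters to register in the extended dictionary 6, the sketch below filters candidate characters either by script or by black-pixel count. The Unicode range used to detect kanji (the CJK Unified Ideographs block), the `min_black_pixels` cut-off, and the function names are assumptions introduced here; the publication only names kanji and the number of black pixels as possible criteria.

```python
from typing import Dict, List, Optional

Pattern = List[List[int]]  # binary character image: 1 = black, 0 = white


def is_kanji(ch: str) -> bool:
    """Assumption: kanji are taken to be the CJK Unified Ideographs block (U+4E00..U+9FFF)."""
    return "\u4e00" <= ch <= "\u9fff"


def black_pixel_count(pattern: Pattern) -> int:
    """Feature not used for recognition: total number of black pixels."""
    return sum(sum(row) for row in pattern)


def select_for_extended(candidates: Dict[str, Pattern],
                        min_black_pixels: Optional[int] = None) -> Dict[str, Pattern]:
    """Keep only the characters judged likely to degrade.

    Without `min_black_pixels`, only kanji are kept; with it, any character whose
    pattern has at least that many black pixels is kept (hypothetical cut-off).
    """
    if min_black_pixels is None:
        return {ch: p for ch, p in candidates.items() if is_kanji(ch)}
    return {ch: p for ch, p in candidates.items()
            if black_pixel_count(p) >= min_black_pixels}
```

Either filter shrinks the extended dictionary, which is what reduces the number of matching operations in the second recognition pass.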

[0044] Moreover, in this invention, the extended dictionary 6 is created using a system as shown in drawing 3. The system of drawing 3 has the storage section 40, such as a hard disk, a floppy disk drive (FD drive), or a CD-ROM drive, which stores character image data and the created dictionary, and the computer section (CPU, ROM, RAM) 41, which performs dictionary creation based on the character image data. The storage section 40 may be incorporated into and integrated with the computer section 41, or, as long as it can be accessed with the communication device 42 through a network such as the Internet, it may be provided in other equipment connected by a communication line.

[0045] The extended dictionary 6 is created as follows using a system as shown in drawing 3. First, multi-value character image data is read from the storage section 40, such as a hard disk. The image data read here is assumed to be quantized to 256 gradations (the number of quantization levels may also be 16, 64, etc.). It may also be color data. The binarization threshold is changed sequentially from 0 to 255, and this image data is binarized. Binarization produces a binary image of "white" and "black". Depending on the binarization threshold, the binary image produced here becomes a crushed character image or a blurred character image.
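
The threshold sweep can be sketched as below. The pixel convention is an assumption: values are treated as ink density (higher means darker) and a pixel becomes black when its value is at or above the threshold, which is the polarity under which small thresholds give crushed images and large thresholds give blurred ones, as the text describes. The function names are hypothetical.

```python
from typing import Iterator, List, Tuple

Gray = List[List[int]]    # ink density per pixel, 0..255 (higher = darker, by assumption)
Binary = List[List[int]]  # 1 = black, 0 = white


def binarize(gray: Gray, th: int) -> Binary:
    """A pixel becomes black when its density is at or above the threshold."""
    return [[1 if v >= th else 0 for v in row] for row in gray]


def sweep_thresholds(gray: Gray, levels: int = 256) -> Iterator[Tuple[int, Binary]]:
    """Yield (threshold, binary image) for every threshold from 0 to levels - 1."""
    for th in range(levels):
        yield th, binarize(gray, th)
```

With 16 or 64 quantization levels, only the `levels` argument changes.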

[0046] Character recognition processing is first carried out on this binary image using the standard
dictionary 5, which was made from character patterns of good quality. In general, an image that is
crushed or blurred cannot be recognized with the standard dictionary 5. The following explains, as an
example, the case where binary images are created and recognition processing is carried out while the
binarization threshold is changed from the crushing side (where the threshold is small) toward the
blur side (where the threshold is large). As the threshold is changed sequentially from crushing
toward blur and recognition processing is carried out, the crushed binary image that is the first to
be recognized correctly is taken as the crushing marginal pattern. Recognition processing continues
while the threshold is changed further, and the blurred binary image that is the last to be recognized
correctly is taken as the blur marginal pattern. This situation is shown in drawing 4. The dictionary
created from these two patterns, the crushing marginal pattern and the blur marginal pattern, becomes
the extended dictionary 6.
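As a hedged illustration of how the two marginal patterns are picked out, the following sketch takes the per-threshold recognition results and returns the first and last thresholds at which recognition succeeded. The function name marginal_thresholds and the sample outcome are hypothetical and do not appear in the specification.

```python
from typing import Optional, Sequence, Tuple


def marginal_thresholds(correct: Sequence[bool]) -> Optional[Tuple[int, int]]:
    """Return (crushing_marginal_th, blur_marginal_th), or None.

    correct[th] is True when the image binarized at threshold th was
    recognized correctly with the standard dictionary.
    """
    hits = [th for th, ok in enumerate(correct) if ok]
    if not hits:
        return None  # never recognized: no marginal pattern exists
    return hits[0], hits[-1]


# Hypothetical outcome of a sweep: only thresholds 90..180 are recognized.
outcome = [90 <= th <= 180 for th in range(256)]
print(marginal_thresholds(outcome))  # (90, 180)
```

The binary images at these two thresholds would be the crushing marginal pattern and the blur marginal pattern, respectively.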

[0047] Drawing 5 is a flow chart showing the processing flow for creating the extended dictionary 6.
Referring to drawing 5, the binarization threshold th is first initialized to "0" (step S1). Next, the
binarization threshold th is incremented by "1" (step S2). Then the gray image is binarized with the
binarization threshold th (step S3), and recognition processing using the standard dictionary 5 is
carried out on this binarized image (step S4). From the result, it is judged whether this binarized
image is a crushing marginal image or a blur marginal image as shown in drawing 4 (step S5).

[0048] When the binarized image is neither a crushing marginal image nor a blur marginal image, it is
judged whether the binarization threshold th is below "256" (step S6). When the binarization threshold
th is still below "256", processing returns to step S2, the binarization threshold th is incremented
by "1", and the same processing is repeated.

[0049] When it is judged at step S5 that the binarized image is a crushing marginal image or a blur
marginal image, this crushing marginal image or blur marginal image is saved (registered) as a
degradation pattern image (step S7). Processing then proceeds to step S6, where it is judged whether
the binarization threshold th is below "256". When the binarization threshold th is still below "256",
processing returns to step S2, the binarization threshold th is incremented by "1", and the same
processing is repeated.

[0050] This processing is repeated, and when the binarization threshold th is no longer below "256" at
step S6, the extended dictionary 6 is created based on the degradation pattern images saved
(registered) at step S7 (step S8).
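The flow of steps S1 to S8 can be sketched as a single loop. This is only one possible reading, under stated assumptions: the text does not say how step S5 decides that an image is marginal, so the sketch treats the first success as the crushing marginal image and keeps overwriting the blur candidate until the last success. The names collect_degradation_patterns, toy_recognize and patch are hypothetical, and toy_recognize merely stands in for recognition with the standard dictionary 5. As in the flow chart, th is incremented before the first binarization.

```python
from typing import Callable, List, Tuple

GrayImage = List[List[int]]    # gray levels, assumed 0..255
BinaryImage = List[List[int]]  # 0 = white, 1 = black


def binarize(gray: GrayImage, th: int) -> BinaryImage:
    # Assumption: higher level = darker, so a small th gives a crushed image.
    return [[1 if v >= th else 0 for v in row] for row in gray]


def collect_degradation_patterns(
    gray: GrayImage,
    label: str,
    recognize: Callable[[BinaryImage], str],
    limit: int = 256,
) -> List[Tuple[str, BinaryImage]]:
    """Follow steps S1 to S8 for one character image with a known label."""
    crushing = None  # first correctly recognized image
    blur = None      # latest correctly recognized image
    th = 0                                  # S1: initialize th to 0
    while th < limit:                       # S6: repeat while th is below 256
        th += 1                             # S2: step th by 1
        image = binarize(gray, th)          # S3: binarize with th
        if recognize(image) == label:       # S4: recognize the binarized image
            if crushing is None:            # S5: first success = crushing margin
                crushing = image            # S7: save (register)
            blur = image                    # S7: overwritten until the last success
    # S8: the saved images become entries of the extended dictionary
    return [(label, img) for img in (crushing, blur) if img is not None]


def toy_recognize(image: BinaryImage) -> str:
    # Hypothetical stand-in for the standard dictionary: accepts the
    # character only when the stroke is neither too fat nor too thin.
    black = sum(sum(row) for row in image)
    return "A" if 3 <= black <= 9 else "?"


patch = [
    [10, 120, 250, 120, 10],
    [10, 130, 255, 130, 10],
    [10, 120, 250, 120, 10],
]
entries = collect_degradation_patterns(patch, "A", toy_recognize)
```

For this toy input, entries holds two degradation pattern images labeled "A": the fattest stroke that is still accepted and the thinnest one.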

[0051] The recognizable range of the extended dictionary 6 created in this way is shown in drawing 6.
As drawing 6 shows, using the extended dictionary 6 expands the recognizable range of the standard
dictionary 5 by the recognizable range of the extended dictionary 6.
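A small sketch may help show why adding marginal patterns widens the recognizable range. Everything in it is an assumption made for illustration: the similarity measure (fraction of agreeing pixels), the 3x3 patterns, and the way the two dictionaries are simply searched together are not taken from the specification.

```python
from typing import List, Tuple

Pattern = Tuple[int, ...]  # flattened 3x3 binary image, 1 = black


def similarity(a: Pattern, b: Pattern) -> float:
    """Fraction of pixels on which two equally sized patterns agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def recognize(x: Pattern, entries: List[Tuple[str, Pattern]]) -> str:
    """Return the label of the entry with the largest similarity to x."""
    return max(entries, key=lambda e: similarity(x, e[1]))[0]


# Hypothetical good-quality templates for the standard dictionary.
standard = [
    ("I", (0, 1, 0, 0, 1, 0, 0, 1, 0)),
    ("O", (1, 1, 1, 1, 0, 1, 1, 1, 1)),
]

# Hypothetical crushing marginal pattern of "I": still recognized as "I"
# with the standard templates alone, so it qualifies as a marginal pattern.
marginal_i = (1, 1, 0, 0, 1, 0, 0, 1, 1)
extended = [("I", marginal_i)]

# A more heavily crushed "I" that the standard templates confuse with "O".
crushed_i = (1, 1, 1, 0, 1, 0, 1, 1, 1)

print(recognize(crushed_i, standard))             # O
print(recognize(crushed_i, standard + extended))  # I
```

In this toy case the crushed input lies outside the range the standard templates can recognize, but it is close enough to the registered marginal pattern to be recognized once the extended entries are included.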

[0052] Thus, in this invention, the character reader has the standard dictionary 5 in which
information based on the character pattern of a predetermined character is stored in advance for
every character, a character cut-out means which cuts out a character from an input image, and a
recognition processing means which compares and collates information based on the character pattern
of an unknown character with the information stored in advance for every character in the standard
dictionary, and outputs the character with the largest similarity as the recognition result. In this
character reader, the binarization threshold of a multi-level image is changed sequentially from the
value at which the binary image is crushed to the value at which it becomes blurred, and the binary
images obtained as the threshold changes are recognized using the standard dictionary 5. The extended
dictionary 6 can then be created using the binary character pattern at the threshold where
recognition with the standard dictionary 5 first succeeded (crushing pattern) and the binary
character pattern at the threshold where recognition last succeeded (blur pattern).

[0053] Alternatively, the extended dictionary 6 can be created using only the crushing pattern. In
this case, the binarization threshold of a multi-level image is changed sequentially from the value
at which the binary image is crushed to the value at which it becomes blurred, the binary images
obtained as the threshold changes are recognized using the standard dictionary 5, and the binary
character pattern at the threshold where recognition with the standard dictionary 5 first succeeded
(crushing pattern) is used.

[0054] Alternatively, the extended dictionary 6 can be created using only the blur pattern. In this
case, the binarization threshold of a multi-level image is changed sequentially from the value at
which the binary image is crushed to the value at which it becomes blurred, the binary images
obtained as the threshold changes are recognized using the standard dictionary 5, and the binary
character pattern at the threshold where recognition with the standard dictionary 5 last succeeded
(blur pattern) is used.

[0055] Moreover, when the standard dictionary 5 and the extended dictionary 6 are created, the
character patterns of the predetermined characters can be stored on a server machine connected to a
network, and the standard dictionary 5 and/or the extended dictionary 6 can be created on a separate
client machine based on the character patterns of the predetermined characters stored on said server
machine.

[0056] Moreover, drawing 7 is a drawing showing an example of a hardware configuration of the
character reader of drawing 1. Referring to drawing 7, this character reader is realized by, for
example, a personal computer, and has a CPU 21 which controls the whole, a ROM 22 in which the control
program of the CPU 21 and the like are stored, a RAM 23 used as a work area of the CPU 21, a scanner
24 which reads a document as a document image, and a result output unit 26 (for example, a display or
a printer) which outputs the result of the recognition processing performed on each character image
contained in the document image.

[0057] Here, the CPU 21 has the functions of the control section 4, the pretreatment section 2, and
the recognition processing section 3 of drawing 1. Moreover, the standard dictionary 5 and the
extended dictionary 6 can be stored in the RAM 23.

[0058] The functions of the CPU 21 as the control section 4, the pretreatment section 2, the
recognition processing section 3, and so on can be provided in the form of a software package
(specifically, an information record medium such as a CD-ROM). For this reason, in the example of
drawing 3, a medium driving gear 31 is provided which drives the information record medium 30 when it
is set.

[0059] In other words, the character reader of this invention can also be implemented in a
configuration in which a general-purpose computer system equipped with an image scanner, a display,
and the like reads a program recorded on an information record medium such as a CD-ROM, and the
microprocessor of this general-purpose computer system performs the pretreatment and the recognition
processing. In this case, the program for performing the pretreatment and recognition processing of
this invention (that is, the program used with the hardware system) is provided in a state of being
recorded on the medium. The information record medium on which the program and the like are recorded
is not restricted to a CD-ROM; a ROM, a RAM, a flexible disk, a memory card, or the like may be used.
The program recorded on the medium is installed in a storage device built into the hardware system,
for example a hard disk drive unit, and executing the program realizes the pretreatment function and
the recognition processing function.

[0060] In addition, although drawing 3 was given as an example of the structure of a system used for
dictionary creation and drawing 7 as an example of a hardware configuration of the character reader
of drawing 1, the CPU, ROM, RAM, or storage section 40 can be shared between drawing 3 and drawing 7.
As is clear from the above explanation, the systems of drawing 3 and drawing 7 can also be configured
as a single apparatus.

[0061] 

[Effect of the Invention] As explained above, according to the invention of claims 1 through 11 and
claim 15, recognition processing uses not only the standard dictionary but also the extended
dictionary, in which character patterns that could not be recognized are registered, so the
recognition precision for low-quality characters can be raised without reducing the recognition rate.
That is, a character reader can be provided which recognizes images of good quality at high speed and
also recognizes low-quality images with high accuracy.

[0062] Moreover, according to the invention of claims 12 through 14 and claims 16 and 17, a dictionary
capable of recognizing character patterns of degraded quality can be created at high speed and with
high precision.



DESCRIPTION OF DRAWINGS 



[Brief Description of the Drawings] 

[Drawing 1] A block diagram of the character reader according to this invention.
[Drawing 2] A flow chart showing the processing flow of the recognition processing section.
[Drawing 3] A drawing showing an example of the structure of a system used for dictionary creation.
[Drawing 4] A drawing for explaining the saving (registration) of a crushing marginal image or a blur
marginal image.
[Drawing 5] A flow chart showing the processing flow for creating the extended dictionary.
[Drawing 6] A drawing showing the recognizable range of the created extended dictionary.
[Drawing 7] A drawing showing an example of a hardware configuration of the character reader of
drawing 1.

[Description of Notations] 

1 Image Input Section 

2 Pretreatment Section 

3 Recognition Processing Section 

4 Control Section 

5 Standard Dictionary

6 Extended Dictionary

21 CPU 

22 ROM 

23 RAM 

24 Scanner 

26 Result Output Unit 

30 Information Record Medium 

31 Medium Driving Gear

40 Storage Section 

41 Computer Section 

42 Communication Device 


